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Abstract 



We present a statistical model for the quantitative peak informa- 
tion obtained from an electropherogram of a forensic DNA sample. 
Our model directly describes peak height information and the dropout 
of an allele is interpreted as failure for its associated peak to be ob- 
served above a detection threshold. Stutter and dropin are readily 
represented in our model. The parameters of the model, and their 
standard errors, are estimated by maximum likelihood in the pres- 
ence of multiple unknown contributors, exploiting a Bayesian network 
representation for efficient computation. A case example from the lit- 
erature is used to illustrate the efficacy of the model, both in finding 
likelihood ratios for evidential calculations, and in the deconvolution 
of mixtures for the purpose of finding likely profiles of one or more un- 
known contributors to a DNA mixture. Our model is readily extended 
to simultaneous analysis of more than one mixture as illustrated in a 
case example. We show that combination of evidence from several 
traces may give an evidential strength close to that of a single source 
trace and thus modelling of peak height information provides for a 
very efficient mixture analysis. 

Key words: allelic dropout; Bayesian networks; DNA profiles; forensic 
statistics; gamma distribution; mixture deconvolution; silent alleles, strength 
of evidence, stutter. 
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1 Introduction 



Much effort is devoted by forensic and other scientists to address the chal- 
lenges that arise in analysing DNA samples recovered from crime scenes. As 
detailed in Section fl~2| some approaches make use only of the presence or 
absence of allelic peaks in an electropherogram or use the peak heights in a 
semi-quantitative fashion; other approaches seek to develop fully quantitative 
models of observed peak heights. 

This paper presents a new statistical model for the peak heights of an 



electropherogram. It builds upon earlier models presented in (Cowell et al. 



2007a[b 2011) but is simpler than these, eliminating discretizations of some 



continuous parameters that were previously introduced, (for example, the 
relative amounts of DNA from contributors) to make computations feasible. 
In contrast to the present paper, these papers model normalized rather than 
absolute peak sizes. Dropout of an allele that is expected to be observed can 
then be interpreted as, and modelled by, a failure for the associated peak 
to be observed above a threshold. Spontaneous dropin is modelled by small 
amounts of DNA from contributors of unknown genotypes. 

The simplifications of the model combined with an efficient Bayesian net- 
work representation enables fast computation and permits analysis of mix- 
tures in which the presence of several unknown contributors is posited; in 
particular, estimation of parameters by maximum likelihood becomes feasi- 
ble. The example in this paper has been analysed in detail with two unknown 
contributors, but maximized likelihood ratios have also been calculated with 
up to six unknown contributors, see Section HI 

The plan of the paper is as follows. In the next section we present back- 
ground information concerning DNA relevant for mixture analyses carried 
out by forensic scientists. We summarize the measurement processes carried 
out for quantifying DNA mixtures, and the artefacts that can arise in these 
processes and lead to difficulties in their interpretation. We then present a re- 
view of related work of DNA mixture modelling. A case example taken from 
the literature is presented in Section [T73l that we shall analyse in the remain- 
der of the paper. In Section |2] we present our basic model for peak heights 
and its elaboration with inclusion of various artefacts. In Section [3] we ap- 
ply our model to the case presented in Section 11.31 estimating the unknown 
parameters in our model by maximum likelihood using an efficient Bayesian 
network representation. Section 13.31 is devoted to showing how the modu- 
larity of the Bayesian network representation may be used to elicit further 
details in the analysis of a mixture, for example by finding the probability 
that a particular peak is a stutter peak; it is also shown how the networks 
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may be extended for other purposes. In Section |4] we discuss suggestions for 
future work. 



1.1 DNA mixtures 



In this section we describe the nature of the information that we analyse with 
our DNA mixture model and give a brief description of the DNA amplification 
process and associated measurements as carried out in forensic laboratories. 
We also introduce some of the complications that can occur. An excellent in 
depth description may be found in Butler (2005). 



1.1.1 Short Tandem Repeat (STR) markers 

Forensic scientists encode an individual's genetic profile using the composi- 
tion of DNA at various positions on the chromosomes. A specific position 
on a chromosome is called a locus, or marker. The information at each locus 
consists of an unordered pair of alleles which forms the genotype at that 
locus; a pair because chromosomes come in pairs (one inherited from the 
father, the other from the mother), and unordered because it is not recorded 
from which chromosome of the pair each allele originates. Human DNA has 
twenty three pairs of chromosomes: twenty two autosomal chromosome pairs 
and a sex-linked pair. 

The loci used for forensic identification have been chosen for two reasons. 
The first reason is that at each locus there is a wide variability between in- 
dividuals in the alleles that may be observed. This variability can therefore 
be exploited to differentiate people. The second reason is that each locus 
is either on distinct chromosomes, or if a pair of loci are on the same chro- 
mosome then they are widely separated. This means that the loci may be 
treated as mutually independent. 

The alleles of a marker are sequences of the four amino acid nucleotides 
adenine, cytosine, guanine and thymine, which we represent by the letters 
A, C, G and T. Each amino acid is also called a base, and because the DNA 
molecule has a double-helix structure, each amino acid on one strand is linked 
to a complementary amino acid on the other strand; a complementary pair 
of amino acids is called a base pair. 

The value of an allele is typically represented by its repeat number, usually 
an integer. For example, consider an allele with repeat number 5 (commonly 
also referred to as allele 5 for brevity) of the marker TH01. This allele 
consists of the sequence of alleles AATG repeated consecutively five times. 
It can be designated by the formula [A AGT] 5. Likewise allele 8 of TH01 has 
eight consecutive repetitions of the AATG sequence, which may be denoted 
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by [yL4GT]g. Repeat numbers are not always integers. For example, allele 
8.3 of TH01 has the chemical sequence [AAGT} 5 AGT[AAGT] 3 , in which '8' 
refers to the eight complete four-word bases [AAGT] and the '.3' refers to 
the three base-long word sequence AGT in the middle. Repeat numbers 
with decimal M' and '.2' endings are also possible, indicating the presence 
of a word of one or two bases. Note that although the integer part of the 
repeat number counts how many complete words of four bases make up the 
allele sequence, the words need not be all identical. For example, allele 11 
of the marker VWA has the base sequence TCTA [T CTG] 3 [TCTA } 7 . The 
base-letter sequences for many alleles may be found in |Butler| ( |2005[ ). 

When the repeat numbers of the two alleles of an individual at a marker 
are the same, then the genotype for that marker is said to be homozygous; 
when the repeat numbers differ, the genotype for that marker is said to be 
heterozygous. 

The repetitive structure in the alleles gives rise to the term short tandem 
repeat (STR) marker to describe these loci; they also go by the name of 
micro satellites. Within a population the various alleles of STR markers do 
not occur equally often, some can be quite common and some quite rare. 
When carrying out probability calculations based on DNA, forensic scientists 
use estimates of allele frequencies or probabilities based on the profiles of a 
sample of individuals. The sample sizes typically range from a few hundred 
to thousands of individuals. For example, Butler et al. (2003) presents tables 
of US-population STR-allele frequencies for Caucasian, African-Americans 
and Hispanics based on sample sizes of 302, 258 and 140 individuals. 



1.1.2 The PCR process 

The DNA collected from a crime scene for forensic analysis consists of a num- 
ber of human cells from one or more contributors. Note that each individual 
cell will contain two alleles for each marker of interest. This means that a 
particular contributor will contribute the same total number of alleles for 
each marker. In order to identify the alleles that are present, a DNA sample 
is first subjected to chemical reagents that break down the cell walls so that 
the individual chromosomes are released into a solution. A small amount 
of this solution is used to quantify the concentration of DNA; the typical 
unit of measurement is picograms per micro-litre, a single human cell has a 
mass of around six picograms. Having determined the density of DNA in the 
sample, a volume is extracted that is estimated to contain a certain quantity 
of DNA, typically around 0.5 nanograms, equivalent to around 100 human 
cells. The DNA in the extract is then amplified using the polymerase chain 
reaction (PCR) process. This involves adding primers and other biochemi- 
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cals to the extract, and then subjecting it to a number of rapid heating and 
cooling cycles. Heating the extract has the effect of splitting apart the two 
complementary strands of DNA making up the markers, the cooling phase 
then allows free floating amino acids to bind with these individual strands 
in such a way that the DNA is copied. By the action of repeated heating 
and cooling cycles, typically around 28 altogether, an initially small quantity 
of DNA is amplified to an amount large enough for accurate quantification. 
Mathematically, the amplification process may be modelled as a branching 
process (Sun 1995; Stolovitzky and Cecchi, 1996). The amplification process 
is not 100% efficient, that is, not every allele gets copied in each cycle. This 
means that if two distinct alleles in a marker are present in the extract in the 
same amount prior to amplification, they will occur in different amounts at 
the end of the PCR process. In our model we capture this variability using 
gamma distributions. 

Note that after breaking down the cell walls to release the chromosomes 
there could be sufficient DNA in the sample to carry out PCR amplifications 
with several extracts. When this is done it is called a replicate run. 

To understand the quantification stage of the post PCR amplified DNA, 
it is important to know that the amplification process does not copy only the 
repeated DNA word segment of a marker, it also copies DNA at either end. 
These are called flanking regions, and their presence is important in perform- 
ing the PCR process. Thus an amplified allele will consist of the allele word 
sequence itself and two flanking regions, and will have a length associated 
with it, which is measured in the total number of base pairs included in the 
word sequence and the flanking regions. For each marker the size of each 
of the two flanking regions is constant, but different across markers. Thus 
quantifying a certain allele is equivalent to measuring how much DNA is 
present of a certain size. This is carried out by the process of electrophoresis, 
as follows. 

The flanking regions have attached to them a fluorescent dye. The am- 
plified DNA is drawn up electrostatically through a fine capillary to pass 
through a light detector, which illuminates the DNA with a laser and mea- 
sures the amount of fluorescence generated. The latter is then an indication 
of the number of alleles tagged with the fluorescent marker. The alleles 
are not all drawn up together, instead the longer alleles are drawn up more 
slowly, however alleles of the same length are drawn up together. This means 
that the intensity of the detected fluorescence will sharply peak as a group 
of alleles of the same length passes the light detector, and the value of the 
intensity will be a measure of the number of alleles that pass. The detecting 
apparatus thus measures a time series of fluorescent intensity, but it converts 
the time variable into an equivalent base pair length variable. The data may 
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be presented as an electropherogram (EPG), in which the horizontal axis 
gives the base pair measurement, and the vertical axis the light intensity, see 
for example Fig. [TJ 
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Figure 1: An electropherogram (EPG) using the Identifiler marker system. 
This image is supplied courtesy of the Laboratorio di Genetica Forense, Uni- 
versita di Perugia. 



A peak in the EPG indicates the presence of an allele, the height is a 
measure of the amount of the allele in the amplified sample, the measure- 
ment is expressed in relative fluorescence units (rfu); the area of the peak is 
another measure of the amount, but this is highly correlated with the height 
(Tvedebrink et al. , 2010). Both peak height and peak area are determined by 
software in the detecting apparatus. Several different colour fluorescent dyes 
are used, which allows the simultaneous detection and resolution of alleles 
from different markers that have the same length. These are displayed and 
labelled in the electropherogram as different dye lanes. 

We shall call the peak size information extracted from the EPG the profile 
of the DNA sample, or more briefly, the DNA profile. Commonly, DNA 
profile also refers to the combined genotype of a person across all markers. 

In measuring the peak heights there can be a lot of low level noise indi- 
cated by small peaks. Thus a peak amplitude threshold may be set by the 
forensic analyst whereby peaks below the threshold level are discarded. Thus 
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an allele present in the DNA sample will not be recorded as observed if the 
peak it generates is below the threshold; when this happens a dropout of the 
allele is said to have occurred. A dropout can also occur simply by sporadic 
failure of the apparatus. 

Dropout is one of the complications called artefacts that can make the 
analysis of DNA samples difficult. Another common artefact is stutter, 
whereby an allele that is present in the sample is mis-copied at some stage 
in the amplification process, and a four base pair word segment is omitted; 
more rarely other stutter patterns occur. This damaged copy itself takes part 
in the amplification process, and so yields a peak four base pairs away from 
the allele from which it arose. 

Another artefact is known as dropin, which is due to sporadic contami- 
nation of a sample (either at source or in the forensic laboratory) by a very 
low amount of DNA which can lead to small unexpected peaks in the elec- 
tropherogram. 

Current technology allows for the amplification of very little DNA mate- 
rial, even as little as contained within one cell. Amplifications of low amounts 
of DNA are termed low template (or copy number) DNA (LTD). LTD analy- 
ses use an increased number of cycles in the PCR process and are particularly 
prone to artefacts such as dropin and dropout. 

Finally, a mutation in either the primer binding region, the flanking re- 
gion, or the repeat region, can result in the allele not being picked up at all 
by the PCR process, in which case we say that the allele is silent. An allele 
can also be silent because its length is off-scale and the peak does therefore 
not appear in the EPG. 

1.2 Related literature 

Methods for the interpretation of DNA profiles arising from mixtures can be 
classified into: 

a) those mostly based on qualitative information, i.e. information about 
allele presence or absence in the DNA mixture; 

b) those that also use quantitative information by taking into account both 
allele presence and peak sizes. 

We will first discuss some recent relevant papers that base their model 
prevalently on allele presence and secondly we describe those that also ex- 
ploit the information in the peak sizes. For an exhaustive list of papers on 
DNA mixtures we refer the reader to references in cited papers. A list of 
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publications relating to DNA mixtures is maintained at the website of the 
National Institute of Standards and Technology (NISTj^] 



Methods based on qualitative information Recently, the International 
Society for Forensic Genetics published recommendations based on discrete 



models for forensic analysis of single source DNA profiles (Gill et al. 2012). 



They consider a single locus (DNA marker) and at that locus allow for arte- 
facts such as dropin and dropout, including the probability of their occurrence 
in the likelihood ratio computations. Under the independence assumptions 
of this model, they state that it readily extends to multiple loci and DNA 
mixtures. Gill et al. (2012) is largely based on Curran et al. (2005) and 



Balding and Buckleton 



Curran et al. (2005 



(2009) 



give a set based method for likelihood ratio calcula- 
tions which includes multiple contributors and the analysis of replicate runs. 
This was an improvement on previous guidelines for the analysis of low tem- 
plate DNA profiles (LTD) where alleles were not reported unless they were 
duplicated in replicate runs ( Gill et al. , 2000 ) . They consider the probability 



of contamination and dropout, but not of stutter. The model is implemented 
inlGill et all (120071). For the interpretation of LTD, IGill et all (120081) include 



the probabilities of dropout, contamination and stutter; peaks in stutter 
positions are considered ambiguous alleles and included in the calculation. 
Furthermore, they account for dropin by increasing the number of potential 
contributors to the mixture. They also study the robustness of the LR to 
misleading evidence in favour of the prosecution. 



Balding and Buckleton (2009) consider a discrete model for interpreting 
LTD mixtures where peaks are classified as present, in stutter position, or 
masking. The set of masking alleles is defined as every peak above a desig- 
nated threshold that either corresponds to an allele of a known contributor, 
or is in a stutter position to that allele. Based on this paper, Balding wrote 
a suite of R functions likeLTD^] built for a range of crime scene DNA pro- 
files, involving complex mixtures, uncertain allele designations, dropin and 
dropout, degradation, stutter, relatedness of alternative possible contribu- 
tors, as well as replicated runs. Alleles are defined as present, absent, or 



uncertain; dropout is modelled using Tvedebrink et al. (2009 2012a). Haned 



et al. (2011) use different logistic models for haploid and diploid cells, to- 



gether with a simulation model of the PCR process (Gill et al. 2005) to 
estimate dropout probabilities. Similarly, Haned et al. (2012) and Haned 



and Gill (2011) adopt a model based on allele presence that analyses the 



■"■http : //www. cstl .nist .gov/ strbase/mixture .htm 

2 https : //baldingstatisticalgenetics/ sof tware/likeltd-r-f orensic-dna-r-code 
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sensitivity to dropin and dropout parameters. The papers mentioned in this 
paragraph do not make direct use of peak heights, but peak heights enter 
implicitly in the process of allele classification. 



Methods based on quantitative information Early attempts to ex- 



ploit peak size information include . 


5vett et al. 


(1998 


); 


Perlin and Szabady 


(2001) 


; Wang et al. 


(2006) 


; Bill et al.| 


(2005 


); and 


Curran 


(2008 


)• 


Cowell et al. 



(2007a) introduced the use of gamma distributions to model peak area vari- 
ability observed in the PCR amplification of DNA in mixtures, and presented 
a normal-approximation version of the model in Cowell et al. (2007b). Both 
of these papers carry out calculations using exact propagation algorithms for 
Bayesian networks, but this required discretization of a parameter represent- 
ing the relative amounts of DNA from each contributor to the mixture. Both 
evidential and deconvolution analyses were presented. The suitability of the 
gamma distribution was investigated in Cowell (2009), using the simulation 



model of Gill et al. (2005). 



The gamma model of Cowell et al. (2007a) was extended in Cowell et al. 



(2011) to handle the artefacts of stutter, dropout and silent alleles. Stutter 
was implemented in Bayesian networks by means of a discretization to allow 
probability propagation. The authors showed how Bayesian network repre- 
sentation readily enabled the simultaneous analysis of two DNA mixtures 
that are thought to have contributors in common. The authors noted that 
the computational complexity of the model increased significantly with the 
total number of contributors, limiting the practical use of their methods to 
at most four contributors. 



Perlin et al. (2011) present a model for DNA mixture interpretation ex- 
tremely similar to the normal model used in Cowell et al. (2007b). They use 
a fully Bayesian approach where all parameters are given a prior distribution. 
The model is used both for identification of contributors to a DNA mixture 
and for deconvolution. The model does not incorporate stutter, and dropout 
is accounted for by a background variance parameter. The performance of 
the model is reported for 16 two-person mixture samples, however the per- 
formance for mixtures of DNA from more than two people is not reported. 

The model of Puch-Solis et al. (2012) is based on the gamma distributed 
peak height model of Cowell et al. (2007a 2011). Since DNA degrades more 



the greater the DNA fragment length (Tvedebrink et al. , 2012b), they use a 
correction for this phenomenon. The model they adopt is different to the one 
presented here in that: i) they use peak heights expressed as fractions of the 
total peak height; ii) they use a pre-processing step for classifying peaks as 
stutter peaks; iii) they use a more complex dropout model; iv) they discretise 
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the parameter representing the proportion of DNA from each contributor and 
use a finer grid on extreme intervals to better capture unbalanced mixtures; 
v) all alleles not corresponding to peaks above threshold are lumped together 
into a single compound allele. In addition they only discuss identification 
of contributors to the mixture and not deconvolution. The paper presents 
examples of two person mixtures and does not analyse replicate runs or 
multiple traces. 



Tvedebrink et al. (2010) evaluate the weight of evidence for two person 



mixtures, using a multivariate normal distribution of peak heights. Based on 
DNA mixtures from controlled experiments they find a linear relationship be- 
tween peak height and area and between means and variances of peak height 
measurements. Controlled experiments are also used for estimating the prob- 



ability of allelic dropout per locus using logistic regression (Tvedebrink et al. 



2009, 2012a). 



1.3 Motivating example 



As a motivating example we shall briefly consider from IGill et al. 



(2008), also analysed in Cowell et al. (2011): 



'An incident had occurred in a public house where the deceased 
had spent the evening with some friends. There was an alterca- 
tion in the car park between the deceased (Ki) and several others 
resulting in the death of the victim. The alleged offenders then 
left the scene and went to another public house where they were 
seen to go into the lavatory to clean themselves." 

Two blood stains, MC18 and MC15 were found at the pub lavatory and 
were typed. Both indicated that they were DNA mixtures of at least three 
individuals. The genotypes of the defendant, K 3 , and a known individual, 
K2, alleged to be present at the time of the offence, were determined together 
with that of the victim K\ (all were males). An excerpt of the observed DNA 
profiles of the samples and individual genotypes are displayed in Table [U The 



complete data can be found in Gill et al. (2008) and Cowell et al. (2011). 



Note that if contributors are K\, K 2 & K 3 , the peak at allele 22 of marker 
D2 in MC18 would need to be a stutter and K 2 , s allele 9 on marker TH01 
would have dropped out of the MC15 profile. 



1.4 Objectives of analysis 

The analysis of a DNA profile can have different objectives depending on the 
context. The objective can be a quantification of the strength of evidence for 
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Table 1: Excerpt of peak heights and genotypes of individuals for the pub 
crime case. 



Marker Alleles MC18 MC15 K x K 2 K 3 



D2 


16 


189 


64 






16 




17 


171 


96 






17 




22 


55 













23 


638 


507 


23 








24 


673 


524 


24 


24 




D16 


11 


534 


256 




11 


11 




12 


1786 


1724 


12 


12 






13 


265 


109 






13 


TH01 


7 


670 


727 


7 








8 


636 


625 


8 








9 


99 







9 






9.3 


348 


165 






9.3 



a given hypothesis over another, or the objective may be a deconvolution of 
the profile, i.e. one wishes to identify likely genotypes of contributors (Perlin 



and Szabady, 2001 Wang et ah, 2006). These objectives are not completely 



separate, as strong evidence for a particular individual being a contributor 
typically corresponds to an individual with this particular genotype being a 
likely contributor to the profile. We briefly describe the typical situations 
below. 



Evidential strength The available evidence E consists of the peak heights 
as observed in the EPG as well as the combined genotypes of the known 
individuals. It is customary to assume relevant population gene frequencies 
to be known. We shall return to this issue in Section HI 

Typically there will be at least two competing hypotheses involved. For 
example, in the case above the prosecution hypothesis H p could be that the 
profile has exactly three contributors who are identical to the three known 
individuals K 1 , K 2 , and K 3 . 

In contrast, the defence hypothesis H d could be that the trace contains the 
DNA of K\ and K 2 , but also one or more unknown individuals who we shall 
name Ui, U 2 , U 3 , ■ ■ ., whereas K 3 has not contributed. The genotypes of the 
unknown contributors are assumed to be chosen randomly and independently 
from a population with known genotypes. 
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The strength of the evidence is then (Evett et al. , 1991 Weir et al. , 1997 
Gill et al. , 2012) reported as a likelihood ratio: 



L(H p ) _ Pr(E\H p ) 
L(H d ) Pv(E\H d y 

The calculation of this evidence can be important both when presenting a 
case to the court, but also in the investigative phase of a trial, to decide 
whether it is worthwhile to search for additional independent evidence. 



Deconvolution of DNA profiles This calculation attempts to identify 
the combined genotypes across all markers of each unknown contributor to 
the mixture and give a list of potential genotypes of a perpetrator to use for 
a database search. For example, we could wish to calculate 

Pr{£/ ls U 2 | E, H iQV } or Pr^ | E, H inv }, 

where Ui, 11% represent genotypes of unknown contributors to the mixture 
under an investigative hypothesis H[ nv that the trace contains DNA from 
three persons, one of whom is the victim K\. The calculation should be 
done for a range of most probable combinations of genotypes of the unknown 
contributors. 



1.5 Evidential efficiency 

We should like to emphasize that evidence against a suspect K s based on 
our model can never be stronger than the inverse match probability for K s 
obtained by a single source DNA profile. 

To see this, let ti s = Pv(U = K s ) denote the match probability, i.e. the 
probability that a random individual has the genotype of the suspect K s . 
Further, let H p be the prosecution hypothesis involving K s as a contributor 
and H d the defence hypothesis, replacing K s by a random individual U. We 
then have 

Pr(£ | H p ) Pr(E | H p ) 

Pr(E\H d ) " Pt(E \H d ,U = K s )ir s + (1 - tt s ) Pt(E \H d ,U? K s ) 

Pr(£ | H p ) 

Pr(£ | H p )ir s + (1 - tt s ) Pr(E \H d ,U^ K s ) 
P?(E\H P ) 1 



< 



Pr{E | H p )tc s 7T s 



Thus, a mixed trace can never give stronger evidence than a high-quality 
trace from a single source. We define the specific evidential efficiency of E 
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against K s as 

Pr(£ | H p )/ Pi(E | H A ) Pr(£ \ H p )ir s 



Q(E | K s 



1/tt, Pr(£ | # d ) 



This is a number between and 1 which indicates how close the evidence 
against K s is to the maximal possible. Indeed, the loss of efficiency for a 
mixed trace can be compared to the loss by failure of identifying the genotype 
for some markers, as this increases the match probability in a similar way. 
For the case example in Section II. 3[ using the US Caucasian gene frequencies 
of Butler et al. (2003), the inverse match probability for K$ is — logio — 
14.45, whereas if we ignored, say, markers D18 and D19 the inverse match 
probability would be — log 10 7Tx 3 = 12.05. 

In the case of deconvolution where we do not have a specified suspect, 
we could consider the evidence against the posterior most likely unknown 
person under the defence hypothesis K*. If H* denotes the prosecution 
hypothesis with U replaced by K* and tc* = Pv(U = K* | Ha) denotes the 
prior probability of a random individual having genotype K*, we would get 

Vx{E\Hl)/Vx{E\H d ) 



Q(E | K*) 



1/tt* 

Pr(£ \U = K*, H d ) Pv(U = K*\H d 
Pr(£ | H d ) 



Pr{U = K*\H d ,E). 



Thus, this generic evidential efficiency is equal to the maximum posterior 
probability obtained in the deconvolution. For a perfect single source trace, 
the generic evidential efficiency is 100%. 



2 A gamma model with artefacts 



In this section we present an overview of the basic model for peak heights, 



which is based on the gamma model described in Cowell et al. (2007a) and 



used in Cowell et al. (2011) with extensions describing artefacts. It should 



be emphasized that the model used here is different in two main respects: 
firstly it uses absolute peak heights instead of relative peak heights which 
enables direct treatment of dropout; secondly, it has a simpler and more 
realistic description of artefacts. As a consequence, computations are greatly 
simplified so analysis is possible for a high number of unknown potential 
contributors. 
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2.1 Basic model 



We consider / potential contributors to a DNA mixture. Let there be M 
markers used in the analysis of the mixture with marker m having A m allelic 
types, m = 1, . . . , M. Let denote the fraction of DNA from individual i 
prior to PCR amplification, with = (0i, 02, • • • , 4>i) denoting the vector of 
fractions from all contributors. Thus 0j > and Yn=i 4>i — 1- F° r identifia- 
bility of the parameters we further assume that the fractions 0j for unknown 
contributors are listed in non-increasing order, i.e. <p a > <pb if & < b are both 
unknown individuals. It is assumed that these pre-amplification fractions of 
DNA are constant across markers. 

For a specific marker m and allele a, the model is describing the observed 
peak height H a . Ignoring artefacts for the moment, the model makes the 
following further assumptions: Each contribution H ia from an individual i 
to the peak height at allele a has a gamma distribution, H ia ~ r(p0j7ij o , 77), 
where T(a,f3) denotes distribution with density 

f(h \a,/3) = Ssn e~ h/ ? for h > 0. 

For a = 0, r(0,/3) is the distribution degenerate at 0. Here p is proportional 
to the total amount of DNA in the mixture prior to amplification; the number 
of alleles of type a carried by individual i is denoted by n ia ; and the parameter 
f] determines the scale. 

We have suppressed the dependence of the heights Hj™ on markers and 
other quantities, including the parameters. It is important for the model 
that (pi are the same for all markers, whereas one would typically expect 
other parameters to be marker dependent and possibly dependent on frag- 



ment length in case of degraded DNA (Tvedebrink et al. 2012b). To avoid 
cluttering of notation and formulae we shall largely ignore this dependence 
in the following. 

It becomes practical to introduce the effective number B a (<fr, n) of alleles 
of type a (ignoring stutter) where B a ((p,n) = J2i<f>inia, and n = (rii a ,i G 
I, a G A); we note that as — 1, we have X)a-Sa(0) n ) = 2 for all 

markers. As a sum of independent gamma distributed contributions, the 
peak height H a at allele a is gamma distributed as 

if o ~r{ Pj B a (0,n),77}. 

The distribution of H a has expectation pqB a ((j), n) and variance pr] 2 B a ((p, n). 
Thus H a is approximately proportional to the amount of DNA of type a and 
the variance is proportional to the mean. We let p = prj and a = 1/ yfp. Then 
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- in a trace with only one heterozygous contributor and no artefacts — p 
is the mean peak height and a the coefficient of variation for peak heights; 
hence a becomes a measure of generic peak imbalance. In the presentation 
of results we shall often use (a, p) instead of (p, rf) because of their direct 
interpretability. 



2.2 Incorporating artefacts 

The model described above does not incorporate a number of important 
artefacts as described in Section [TTT1 This section is devoted to necessary 
modifications. 

Stutter We represent stutter by decomposing the individual contributions 
Hi a to peak heights in the EPG into Hf a and H® a so that 

Ha = H- a + H° a 

where Hf a represents the DNA originating from individual i of allelic type 
a that stutters to a — 1 by losing a repeat number, and H® a represents the 
remainder that goes through the PCR process undamaged. We let the com- 
ponents be independent and gamma distributed as 

H ta ~ r{p^n ia , rj}, H? a ~ T{p(l - £)<Mia, v}, 

where £ is the mean stutter percentage. The total peak height observed at 
allele a is then 

Ha = X] Hia + H la+1 = H ° a + H & a+1 . 

i i 

The total peak height is also gamma distributed as 

ff a ~r{pD a (^£,n),77} 

where 

D a {<f>, e, n) = (1 - O£a(0, n) + 6B a +i(0, n) (2) 

are the effective allele counts after stutter. The relative contribution lost to 
stutter 

X = H " 
a H s + H° 

a 1 a 

thus follows a beta distribution B{^pB a ((p, n), (1 — £)p£> a (0, n)} with mean 
E(X a ) = £. 



15 



Notice that our use of stutter percentage X a differs from standard practice 



which uses the ratio between the stutter peak and the parent peak (Butler 



2005 ). A peak in stutter position may itself include contributions from proper 
alleles, and the parent peak may contain stutter contributions from an allele 
with higher repeat number. Even in the simple event that a peak is due to 
stutter alone from a parent peak which has not itself been inflated by stutter 
we have H a = H® and H* = H a _i and thus our X a = H a _i/ (H a _i + H a ) 
rather than the conventional H a _i/H a . 

Dropout The next step in our model is concerned with the fact that in 
mixtures, in particular with small amounts of DNA, some alleles are not 
observed at all in the sense that no peak can be identified or the peak is 
below the chosen threshold. Here we take the consequence of the model and 
represent dropout by a total peak height below a threshold C, as described 
in Section 11.11 In other words, we are not observing the height H a as just 
described, but rather Z a , where 



Z a 



H a if H a > C 

otherwise. 



This implies that the probability that a specific allele is not observed is 

P(Z a = 0) = G{C; pD a (4>, C, n), 77}, (3) 

where G denotes the cumulative distribution function of the gamma distri- 
bution. 

The probability of allellic dropout was described by |Tvedebrink et aL 



(2009, 2012a) as a linear logistic function of the logarithm of peak height: 



^ = °i fc ) = rr^ (4) 

where /3 = —4.35 for all markers and a was marker dependent. Since they 
used observed average peak height above a threshold as independent variable 
rather than expected peak height, this relation is not directly comparable to 
ours. 

Ignoring the difference and letting h = E(H a ) = pD a (<f), £,11)77 we ma y 
compare the dropout models. Fig. [2] shows a selection of curves Q for (3 = 
—4.35 and representative values of a with curves ^ for C = 50 for values 
of 77 corresponding to the maximum likelihood estimate and upper and lower 
99% confidence limits in our case example, see Section [3] The dropout model 
based on the gamma distributon tends to have lower dropout rates than the 
logistic model for small amounts of DNA. 
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Figure 2: The probability of dropout of a single allele as a function of mean 
peak height. The curves with full lines correspond to our gamma model 
whereas the dashed curves correspond to the logistic model. 



Both dropout models implicitly define a relation between the dropout 
probability d for a single allele and the corresponding dropout probability 
D for a homozygote. Balding and Buckleton (2009) argue that these should 
satisfy D < d 2 . This relation is satisfied by any threshold model — hence 
also for our gamma model — since D = P(Yi + Y 2 < C) < P{Y\ < C) 2 if 
Y\ and Y 2 are independent and identically distributed. For the model Q we 
have 

ah?; = oc{2hf = 2? : 



d 



D 



d 



and thus 



2?d 



D = l + (2?-Dd (5) 

independently of the value of a. The curve in ^ is displayed in Fig. [3] for 
(3 = —4.35 together with the upper bound D = d 2 and a selection of curves 
from the gamma model with C = 50 and the same values for t] as in Fig. [2} 
We note that although ^ actually just crosses the curve D = d 2 for very 
small values of d, this has hardly any practical significance, as also pointed 
out in 



Tvedebrink et al. (2012a). 



Other artefacts We have chosen to represent spontaneous dropin of alleles 
by additional unknown contributors with very small amounts of DNA, im- 
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Figure 3: The probability of homozygous dropout D as a function of the 
dropout probability of a single allele for the gamma model and for the logistic 
model of Tvedebrink et al. ( 2009 ) . The solid curves correspond to the gamma 
model with rj varying from left to right as 40, 28, and 16. 



plying that most of their alleles will drop out with high probability. With the 
efficient computational methods described in Section 12.4} this representation 
is amply feasible. The possibility of a silent allele is easily incorporated into 
the model, simply by adding an allele that never results in an observed peak, 
see Section 13.31 for details. Note that we have modelled dropout entirely as a 
threshold phenomenon, ignoring the fact that apparatus failure could be an 
alternative explanation. 



2.3 Joint likelihood function 

For given genotypes of the contributors n, given proportions <j), and given 
values of the parameters (p, £,77), all observed peak areas are independent. 
Thus, the conditional likelihood function based on the observations z = 
{zma}meA4,aeA m for all markers m and alleles a is 

HP, £> 0> V I Z > n ) = II II L rna(z ma ) 

where 



m a 



g{z ma ; pD a (4>, £, n), 77} if z ma > 
G{C; pD a (<p, £, n), rj} otherwise, 
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with g and G denoting the gamma density and cumulative distribution func- 
tion respectively and D a are the effective allele counts after stutter in ([2]). In 
the above formulae we have suppressed potential marker dependence of p, rj, 
and £ for simplicity of notation. 

For a given hypothesis H, this be that of the prosecution or that of the 
defence, the full likelihood is obtained by summing over all possible com- 
binations of genotypes with probabilities associated with the hypothesis to 
give 



L(H) = Pr(E | #) = £ Hp, MM z, n)P(n | H). 



(6) 



The number of terms in this sum is huge for a hypothesis which involves 
several unknown contributors to the mixture, but can be calculated efficiently 
by Bayesian network techniques (Graversen, 2013b), see also Section 



Estimating unknown parameters The likelihood function L(H) involves 
a number of parameters (p, £, 0, rf) which may be completely or partially un- 
known. One way of dealing with this is to choose parameters associated 
with each of the competing hypotheses that make the likelihood as large as 
possible and thus calculate 

l(h)= sup Y,Hp,tAM*MPMH) 

corresponding to using maximum likelihood estimates for the unknown pa- 
rameters, but different estimates under the competing hypothesis. Exploit- 
ing that the likelihood function ([6]) can be efficiently computed ensures that 
it can be maximized numerically using appropriate standard optimization 
methods. 

Using maximized likelihood ratios preserves the property that evidence 
against a suspect K s in a mixed trace based on our model can never be 
stronger than the inverse match probability against K s obtained by a match- 
ing single source DNA profile. To see this, let tc s = Pv(U = K s ) denote 
the match probability, i. e. the probability that a random individual has the 
genotype of the suspect K s . As before, let H p be the prosecution hypothesis, 
involving K s as a contributor and H d be the defence hypothesis, replacing 
K s by a random individual U. Let further ip p = (p p , £ p , <p p , fj p ) be the maxi- 
mum likelihood estimates of the unknown parameters under the prosecution 
hypothesis and ipd the estimates under the defence hypothesis. We then have 

Pr(ff |ff p ,4) < Pr(ff |ff p ,4) < 1 
Pr(E\H d J d ) ~ Pr(E\H d J p ) ~ vr s ' 
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where the last inequality is obtained as in nty. Thus, also when parameters 
are estimated, a mixed trace can never give stronger evidence than a high- 
quality trace from a single source. 



2.4 Computation 

As mentioned in Section 12.31 the main computational challenge is associated 
with marginalising over all unknown DNA profiles which is needed for eval- 
uating the likelihood function ([6]). This is particularly challenging when the 
number of contributors increases, as the number of terms in ^ becomes 
intractably huge so brute force summation is impossible. To enable efficient 
computation of the sum, we represent the genotypes of the unknown contrib- 
utors and the observations with appropriate Bayesian networks. 

As markers are independent in the model for fixed model parameters, the 
model can be represented by one Bayesian network per marker. Also, un- 
known contributors are assumed unrelated, corresponding to their genotypes 
being independent. We describe here how a genotype for an unknown con- 
tributor can be modelled by a Bayesian network, and briefly mention how to 
combine them into one network whilst also allowing the possibility of includ- 
ing the observed peak height information. Further details on the networks 
and associated computations are described in Graversen (2013afb). 



Representation of genotypes As usual we assume a reference popula- 
tion in Hardy- Weinberg equilibrium so we can consider the two alleles of 
an individual chosen at random with allele-frequencies (qx, . . . , g^)- We re- 
call from Section 12.21 that the distribution of peak height at allele a depends 
on both riia and Ui A+ i for both known and unknown contributors i. To 
enable simple computation of the terms in the likelihood function, we there- 
fore represent the genotype for contributor j by a vector of allele-counts 
(n^i, • • • ,niA)- These vectors then follow independent multinomial distribu- 
tions with Yh n ia = 2. 

Using properties of the multinomial distribution we can describe this dis- 
tribution sequentially as follows. The number nn of the first allelic type 
follows a binomial distribution Bin(2,gx). Denote by S ia = J2b<a n ib the 
number of alleles of type 1 to a that contributor i possesses. For any sub- 
sequent allelic type a + 1, it holds that, conditionally on Si a , the number of 
alleles of type a + 1 is independent of the specific allocations (na, . . . ,n ia ) 
already made. Furthermore, rii, a+ i is also binomially distributed, n^ a+ i ~ 
Bin(2 — S ia , q a+ i/ J2b>a+i lb)- Thus, adding the partial sums of allele counts 
to the network allows the genotype to be modelled by a Markov structure as 
displayed in Fig. |4j 
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Figure 4: Bayesian network modelling the genotypes of 2 unknown contrib- 
utors i and j for a marker with 6 possible allelic types. 



Including peak height information To incorporate information about 
peak height observations, we introduce binary nodes O a in Fig. [4] which rep- 
resents whether a peak for allele a has been seen. By suitable specification 
of the conditional distribution of O a given its parent nodes, conditioning on 
O a being TRUE or FALSE will correspond to conditioning on Z a ; we refrain 
from giving the details here, but refer to Graversen ( 2013afb ). 

The introduction of observational nodes O a with parent nodes within all 
unknown contributors renders the computational effort exponential in the 
number of unknowns. However, the Markov representation of the genotypes 
themselves ensures the complexity to grow linearly with the number of allelic 
types A rather than quadratic as in previously used representations. 



3 Case analysis 



We proceed to illustrate the methodology developed on the case described 
in Section 11.31 For uniformity we have used a threshold of C = 50 for both 
traces MC15 and MC18, although the data for MC18 contain peaks below 
that threshold, more precisely peaks for marker FGA at alleles 21 and 25 of 
height 49 and 39 respectively. Using a threshold of C = 38 for this profile 
makes a negligible difference to the results. 

The computations were all performed with the R-package DNAmixtures 



(Graversen 2013a) which interfaces HUGIN API (Hugin Expert A/S 2012) 



through the R-package RHugin (Konis 



Rsolnp (Ghalanos and Theussl 



2012 



2012). The package DNAmixtures uses 



Ye 1987) for numerical maximization 
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Table 2: Maximum likelihood estimates based on MCI 5 and their approx- 
imate standard errors based on inverting the Hessian of the log-likelihood 
function. 



Prosecution hypothesis 


Defence hypothesis 


Parameter 


Est. 


SE 


Parameter 


Est. SE 




913.1 


35.0 




914.4 40.6 


a 


0.171 


0.018 


a 


0.198 0.023 


£ 


0.074 


0.014 


t 


0.072 0.019 


<i>Ki 


0.821 


0.020 




0.798 0.028 


<t>K 2 


0.047 


0.014 


<PK 2 


0.039 0.019 


<t>K a 


0.124 


0.015 


<f>U! 


0.081 0.013 


4>u 


0.008 


0.018 


(f>U 2 


0.081 0.013 


log 10 L(#) 




118.042 


log 10 L(H) 


-129.335 



of the likelihood functions and the approximate standard errors of estimates 
are based on the inverse Hessian of the likelihood function found by numerical 



derivation using numDeriv (Gilbert and Varadhan, 2012). The population 



allele frequencies used were taken from the US Caucasian database in Butler 



et al. (2003). 



3.1 Strength of evidence 

Analysis of MCI 5 The maximum likelihood estimates and their approxi- 
mate standard errors under the defence hypothesis Ha : KiSzK 2 SzUi&:U2 and 
prosecution hypothesis H p : K\h:K2h,Kj,hJJ are given in Table The last 
unknown contributor is mainly included to allow for potential spontaneous 
dropin. We note that the estimates for the parameters of the PCR process 
(/i, cr, £) are very similar under the two hypotheses, the defence hypothesis 
suggesting a slightly larger coefficient of variation, both a little less than 20%. 
The mean stutter percentage £ is estimated to just under 7.5%. The esti- 
mates of the contributor fractions <fi under the prosecution hypothesis agree 



well with the estimates (0.80,0.06,0.12,0.04) found in Table 3 of Gill et al. 



(2008). The resulting likelihood ratio is log 10 (Li?) = 11.29. For the prosecu- 
tion hypothesis we can ignore spontaneous dropin, the standard deviation of 
4>u indicating this may well be zero. Under the defence hypothesis there are 
difficulties distinguishing the two unknown contributors in that their contrib- 
utor fractions are estimated to the same value. Since the estimates for (f>jj. are 
on the boundary of the parameter space, their standard errors are given for 
the model which further restricts them to be equal as <fiu 1 = <pu 2 . Removing 
one unknown contributor for both hypotheses increases the likelihood ratio 
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Table 3: Maximum likelihood estimates based on MC18 and their approxi- 
mate standard errors. 

Prosecution hypothesis Defence hypothesis 



Parameter 


Est. 


SE 


Parameter 


Est. 


SE 


A* 


1056 


39.3 


A* 


1056 


40.7 


a 


0.166 


0.017 


a 


0.172 


0.019 




0.085 


0.016 


t 


0.085 


0.017 




0.706 


0.022 


<t>Kj 


0.698 


0.026 




0.091 


0.016 




0.096 


0.018 




0.194 


0.018 


<t>U x 


0.193 


0.021 


4>u 


0.009 


0.018 




0.013 


0.021 


log w L(H) 




130.092 


log 10 L(H) 




143.362 



to \og 10 (LR) = 12.12. This should be compared to the analysis in Cowell 



et al. (2011) which for hypotheses with three contributor gave a \og w (LR) 



around 5 and to an inverse match probability for K% of — log 10 ttk 3 = 14.45. 
Thus our more accurate model gives considerably stronger evidential value 
than previously found, although much less than a perfect single source trace. 
Indeed, the loss of efficiency compared to a single source trace is similar to 
the loss from leaving out, for example, markers D18 and D19; see Section 11.51 



Analysis of MC18 The maximum likelihood estimates and their approxi- 
mate standard errors under the defence and prosecution hypotheses are given 
in Table 1 

We note again that the estimates for the parameters of the PCR pro- 
cess (fjL,cr,£) are very similar under the two hypotheses, the defence hy- 
pothesis suggesting a slightly larger coefficient of variation, but the same 
order of magnitude as for MC15. Also here the estimates of the contributor 
fractions under the prosecution hypothesis agree well with the estimates 
(0.67,0.11,0.19,0.04) in Table 2 of |Gill et al.] ( [2008| . This time the defence 
hypothesis identifies an unbalanced contribution from the unknown contrib- 
utors, the major unknown contributor representing a fraction similar to that 
of the defendant K 3 under the prosecution hypothesis. The estimates of the 
mean stutter percentages are slightly larger than found for MC15, although 
the standard errors suggest that the differences are well within what could be 
expected from random variation. The resulting likelihood ratio for this trace 
becomes log lQ (LR) = 13.27. Both hypotheses indicate that one unknown 
contributor can be removed which decreases the likelihood ratio slightly to 
log 10 (L_R) = 13.21. We compare again to Cowell et al. (2011) which gave 
a log 10 (Li?) around 9, again a much weaker evidential value than what we 
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Table 4: Maximum likelihood estimates when information in the traces 
MC15 and MC18 is combined. 

Prosecution hypothesis Defence hypothesis 

MC15 MC18 MC15 MC18 



Par. 


Est. 


SE 


Est. 


SE 


Par. 


Est. 


SE 


Est. 


SE 


1* 


914 


35.7 


1055 


38.3 


1* 


914 


36.2 


1055 


38.9 


a 


0.175 


0.013 


0.163 


0.012 


a 


0.177 


0.013 


0.165 


0.013 




0.079 


0.011 


0.079 


0.011 




0.079 


0.011 


0.079 


0.011 




0.822 


0.020 


0.705 


0.021 


<t>Kj 


0.820 


0.021 


0.702 


0.023 


<t>K 2 


0.048 


0.014 


0.090 


0.016 


4>K 2 


0.049 


0.014 


0.092 


0.017 


4>K 3 


0.124 


0.016 


0.193 


0.018 


4>u! 


0.123 


0.016 


0.193 


0.018 


4>u 


0.006 


0.018 


0.012 


0.017 


<f>u 2 


0.008 


0.019 


0.014 


0.018 


logio 


L(H) 






248.214 


logio 


L(H) 






262.297 



obtain from the more refined model which is similar to having just lost a 
single marker: losing D18 yields a match probability of — log^^ = 13.00. 

Combined analysis of MC15 and MC18 For a combined analysis of the 
two traces there is a range of possibilities. The simplest combined analysis 
assumes both for the prosecution and defence that the unknown contributors 
to the two traces are different and unrelated. Under this assumption, the 
traces are simply independent and the joint likelihood ratio is obtained by 
multiplying the two likelihood ratios found above to yield overwhelmingly 
strong evidence at log lQ (LR) = 24.56. However, this hypothesis is very 
unfavourable for the defence as it is much more likely that the unknown 
persons are identical for the two traces. 

As the traces are dependent when unknown contributors are shared, a 
fresh calculation is needed to find the likelihood ratio. In this calculation 
we assume that the mean stutter percentage £ and the scale parameter 77 
are identical for the two PCR profiles. This seems reasonable since these 
parameters refer to the PCR processes rather than the traces themselves. In 
addition we assume that the unknown contributors are the same for the two 
profiles, yielding the results displayed in Table |H 

We note that the standard errors in Table |4] are not dissimilar to pre- 
vious values, apart from the standard errors of a and the common mean 
stutter percentage £, which have been reduced considerably by combining 
information from the two traces. The estimates for most parameters are 
strikingly similar under the two hypotheses; the fraction attributed to the 
defendant K 3 under the prosecution hypothesis is attributed to the major 
unknown contributor JJ\ under the defence hypothesis. The interpretations 
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under the prosecution and defence hypotheses only disagree on the identity of 
this contributor. The resulting likelihood ratio for the combination of traces 
becomes log 10 (Li?) = 14.09. This evidence has strength similar to that of a 
single source trace with a perfect match to the defendant K3 which yields an 
inverse match probability of — log 10 tck 3 = 14.45. Thus the specific evidential 
efficiency of the combination of the two traces against K% is about 44% and 
the loss of efficiency is thus much smaller than losing a single marker. Also 
the combined analysis indicates that one unknown contributor is redundant 
and could be removed from the analysis. However, this time such removal 
only changes the likelihood ratio to log 10 (Li?) = 14.08. 



Cowell et al. (2011) gave a value of log 10 (Li?) = 8.86 using their model 



indicating a much less efficient analysis of the traces. For curiosity we have 
also run likeLTD on these two traces and obtained log 10 (Li?) = 8.81. This 
number is not directly comparable to ours as it refers to a slightly different 
database of gene frequencies and a coancestry parameter F$t = 0.02 is used, 
giving a larger match probability — log 10 ttk 3 = 13.15. Also, likeLTD assumes 
that the contributor fractions are identical for the two traces since they are 
treated as multiple runs of the same (LTD) trace. Still it seems fair to 
conclude that properly exploiting the information in peak heights gives a 
more efficient analysis of the traces than what can be obtained from allele 
classifications alone; this could be important for the analysis of mixtures 
more complex than those analysed here. 



3.2 Mixture deconvolution 

For mixture deconvolution we consider the traces jointly under the defence 
hypothesis (modified by removing the redundant unknown contributor) and 
identify the five most probable full genotypes of the unknown individual U. 
For most markers, these genotypes share that of the defendant K 3 ; indeed 
the only variations are on markers D16, D18, and D19, where the most prob- 
able configurations are displayed in Table HI The markers not represented 
in the table share the genotype of the defendant for all five most probable 
combinations. We note that the genotype with the top rank is indeed the 
genotype of the defendant, and the evidence gives a probability to the un- 
known person having precisely this genotype of 44%, equal to the evidential 
efficiency of the combined traces, see Section fl~5l The probability that the 
true genotype of the unknown contributor is among the five genotypes listed 
is 73%. 

If we make a deconvolution of the single traces, the identification of the 
genotype of U is less clear, as displayed in Table |6] and Table El The proba- 
bilities of the most probable genotypes are drastically reduced. We note in 
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Table 5: Most probable genotypes of the unknown contributor U under the 
modified defence hypothesis H' d : K\izK2^dJ when information in the traces 
MC15 and MC18 is combined. 



Rank 




Marker 




Probability 


D16 


D18 


D19 


1 


(11,13) 


(12,16) 


(14,15) 


0.436 


2 


(11,13) 


(12,14) 


(14,15) 


0.129 


3 


(11,13) 


(12,16) 


(13,15) 


0.070 


4 


(11,13) 


(12,16) 


(13,14) 


0.052 


5 


(12,13) 


(12,16) 


(14,15) 


0.047 


Total 








0.734 



Table 6: Most probable genotypes of U under the modified defence hypoth- 
esis H' d : K x hK 2 MJ based on MC15 only. 



Rank 




Marker 




Probability 


D16 


D18 


D3 


1 


(12, 13) 


(12, 16) 


(15,19) 


0.00233 


2 


(12, 13) 


(12, 16) 


(17,19) 


0.00217 


3 


(11, 13) 


(12, 16) 


(15 19) 


0.00209 


4 


(11, 13) 


(12, 16) 


(17,19) 


0.00195 


5 


(12, 13) 


(12, 15) 


(15,19) 


0.00143 


Total 








0.00997 



particular the extreme uncertainty for MC15 in isolation. Also, for the single 
trace deconvolutions, the genotype of the defendant is ranked fourth for both 
traces rather than first. 



3.3 Interpreting artefacts 



Our model does not impose at the outset that a specific peak or allele is 
due to stutter, or has dropped out, or is silent. One of the features of 



DNAmixtures (Graversen, 2013a) is to produce Bayesian networks for each 



marker with peak height evidence propagated. Thanks to the flexibility of 
Bayesian networks we can then elaborate these so as to identify the pos- 
sibility that artefacts might be present in some of the markers. For this 



purpose, we use object-oriented Bayesian networks (OOBN) (Dawid et al. 



2007). We introduce class networks that represent the presence of stutter 
and dropout. Simple modifications of the networks can also be made to al- 
low for the possible presence of silent alleles. Instances of the networks are 
then combined with the networks produced by DNAmixtures. In this way, 
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Table 7: Most probable genotypes of U under the modified defence hypoth- 
esis H' d : K x hK 2 MJ based on MC18 only 



Marker 



Rank 


D18 


D19 


Probability 


1 


(12,14) 


(13,14) 


0.108 


2 


(12,16) 


(13,14) 


0.082 


3 


(12,14) 


(14,15) 


0.082 


4 


(12,16) 


(14,15) 


0.062 


5 


(12,14) 


(14,14) 


0.053 


Total 






0.387 



we can answer queries like: what is the probability that an allele is due to 
stutter? that an allele has dropped out? that there is a silent allele in the 
unknown contributor(s) genotype to the mixture? 

Identifying stutter and dropout We use teletype to indicate nodes 
and bold to indicate an instance of a network class. Fig. [5] shows the global 
OOBN used for artefact identification. The analysis in Section [^produces the 




Figure 5: OOBN for artefact identification in marker FGA 



network named marker FGA. We now elaborate this network by designating 
relevant nodes as output nodes: these are n_l_a and n_2_a representing the 
total allele counts for a specific allele a for JJ\ and U2, and the Boolean 
nodes CLa indicating whether allele a is observed or not, as described in 
Section [2~4l Network classes for stutter and dropout are now attached by 
identifying relevant output nodes in marker FGA with appropriate input 
nodes. The class networks for stutter and dropout are shown in Fig. [6| The 
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Figure 6: Networks for stutter and dropout. 




Figure 7: Modified network allowing for silent alleles. The nodes in the 
shaded area are those of the original network. 



networks reflect the defence hypothesis Ha : K\hK 2 <kJJ\<kJJ 2 and represent 
deterministic functions of the input nodes n„l, n_2, 0, as well as Kl and K2, 
the latter pair of nodes indicating the number of a alleles possessed by the 
known contributors K\ and K 2 . An observed peak is due to stutter if is 
TRUE and Ki, K 2 , U\ and U 2 do not have an a allele. This is represented in 
Fig. [6] by the Boolean node stutter? which depends on the founder nodes 
via a series of conjunctions. An allele has dropped out if is FALSE and 
at least one out of K 1 ,K 2 ,Ui or U 2 has allele a. This is represented by the 
Boolean node dropout? in Fig. [6] which now depends on the founder nodes 
via a series of disjunctions. 

Silent alleles The possibility that the unknown contributors might have 
a silent allele is incorporated in the model by adding an extra allelic type 
(silent) to the genotype representations as displayed in Fig. [7) We now 
let n i0 ~ Bin(2,g ), S i0 = n i0 , Sn = nu + S iQ , and further na \ S i0 ~ 
Bin(2 — S i0 ,qi), where q is the probability of an allele being silent and 
q± indicates the database allele frequency. The rest of the network remains 
unchanged and observational nodes are added as in Fig. [4] although the node 
riio representing silent alleles has no observational children. This corresponds 
to a modification of the basic model where the gene frequencies q a , a > are 
interpreted as the relative frequencies of non-silent alleles, so the probability 
that a random allele is of type a is equal to (1 — qo)q a if a > 0. 
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Table 8: Posterior probabilities for alleles to have dropped out and be only 
stutter for trace MC18. 



Marker(a) 


P(dropout) 


P(stutter) 


P(n la = 0) 


P{n 2a = 0) 


D2(22) 




0.927 


1 


0.927 


D3(14) 




0.820 


1 


0.820 


D3(16) 




0.472 


1 


0.472 


D3(19) 










0.981 


D19(12) 




0.828 


1 


0.828 


FGA(21) 


1 





1 


0.683 


FGA(22) 







1 


0.597 


FGA(25) 


0.118 





1 


0.882 



Combined analysis of MC15 and MC18 Using the parameter esti- 
mates in Table H] under the defence hypothesis Hd : K\hK 2 <kXJ\<kXJ'i and 
using the networks described above, we obtain the results in Table [8] 

The third column of the table shows the posterior probabilities in trace 
MC18 for selected alleles to be due to stutter only. Note that the probability 
that D3(16) is due to stutter only is 0.472; the peak height for this allele is 



67 rfu and thus 14% of its parent peak D3(17) at 479 rfu flGill et al.[ [20081 ), 
rendering it rather likely that it may also represent a proper allele. From 
the fifth column we note that the probability that U 2 possesses one or two 
D3(16) alleles is 1 — 0.472 = 0.528. This judgement differs from the result in 



Table 7 of Cowell et al. (2011) where the probability of stutter was equal to 
1, but under a defence hypothesis with one less unknown, hence not allowing 
for the peak being affected by a second unknown contributor. 

Note that D3(14) has a rather high probability of being a stutter peak. 
If evidence that D3(14) is due to stutter is propagated in the network, the 
posterior probability that U 2 possesses one or two D3(14) alleles decreases 
from 1 — 0.82 = 0.18 to zero. This shows that a priori classification of alleles 
as stutter can have drastic consequences. 

The combined analysis of two traces gives more informative results on 
the analysis of artefacts than what emerges from separate analyses of the 
two traces. We abstain from reporting these analyses here. 

Using the combined information from traces MC15 and MC18, the pos- 
terior probabilities that U\ and U 2 have zero or one silent allele in MC18, 
when q = 0.01 are given in Table HI The posterior probability of U\ and 
U2 having no silent alleles is close to 1 for all markers, except for D19 when 
for U\ it is 0.95, thus rendering the presence of a silent allele more probable 
based on the observed peak heights. 
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Table 9: Posterior probabilities that U\ and U2 have no silent allele when 
go = 0.01. 

silent? D2 D3 D19 FGA 
P{n 10 = 0) 0.999 1 0.947 0.962 
p{n 20 = 0) 0.978 0.983 0.978 0.973 



Table 10: Excerpt of a DNA profile with peak heights of a single individual. 



Marker 


Allele 


Height 


D8 


13 


4364 


D21 


28 


2646 




29 


2490 


CSF1PO 


11 


2695 


D3 


14 


2249 




16 


2205 


TH01 


6 


1268 




9.3 


1294 


TPOX 


8 


1394 




11 


1526 


AMEL 


X 


4289 


D5 


11 


2053 




13 


1827 


FGA 


23 


2444 



3.4 Single source analysis 

Although our model is developed for the analysis of mixtures, it can also be 
useful in the analysis of a DNA profile from a single source; in particular 
as the peak heights can be informative about the presence of silent alleles. 
What appears to be a homozygous genotype at some marker may not be so; 
an alternative explanation is that we see only one allele of a heterozygous 
genotype, the other allele being silent. We would expect this to be reflected 
in a peak height that is smaller than expected and it will clearly affect the 
evidential interpretation of DNA profiles. 

Table [10] shows an excerpt of a DNA profile of a single individual J, 
having apparent homozygous genotypes on markers D8, CSF1PO and FGA. 
Note that the peak heights are large, as customary in single source data 
where there is a large amount of DNA. 

Assuming that the DNA profile in Table [10] came from two unknown 
contributors and setting the threshold C = 50, we obtain estimates for the 
fraction of DNA from each unknown contributor of (0i,02) — (1)0) which 
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clearly indicate that the DNA trace comes from a single individual. The 
estimates of the other parameters are fi = 1806, a = 0.290, £ = 5.5 x 10~ 9 . 
The fact that £ ~ clearly picks up that the data have been preprocessed 
and that alleles classified in the laboratory as stutter have been removed. If 
we instead set the threshold at C — 500 to accommodate the preprocessing 
and assume that the data are from a single individual, we get the estimates 
fx = 1854, a = 0.278 and £ = 0.038, but clearly there is very little information 
about the stutter percentage when stutter peaks have been removed in the 
preprocessing, resulting in very flat profile likelihood as shown in Fig. |SJ 
indeed a 95% confidence interval for £ would be ranging from to 11.8%. 




— i 1 1 1 — 

0.00 0.05 0.10 0.15 



Figure 8: Profile log-likelihood max ft(T log 10 L(fj,, a, £) — log 10 L(fi, a, £) for 
£ for the single source trace thresholded at C = 500. Values of £ with log- 
likelihood above the horizontal line constitute an approximate 95% confidence 
interval. 

Table Qj] gives the posterior probabilities that J has a silent allele when 
go = 0.01 and go — 0.1, for markers showing apparent homozygosity D8, 
CSF1PO, and FGA. Note that for genotypes where the peak height is rela- 
tively high like D8, the probability of a silent allele is almost zero, as would 
be expected. Markers CSF1PO and FGA have peak heights around 60% 
and 56% that of D8. The posterior probability of having a silent allele for 
CSF1PO is about 3% when g = 0.01, but rises to 27% when g = 0.1, 
whereas for FGA the corresponding posterior probabilities are 21% when 
g = 0.01, and 74% for g = 0.1 which are considerable. However, as dis- 
cussed further in Section EH care should be taken with interpreting this result 
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Table 1 1 : Posterior probabilities that the individual has a silent allele when 
go = 0.01 and go = 0.1. 



Marker 



go = 0.01 g = 0.1 



D8 

CSF1PO 
FGA 



0.0001 0.0009 
0.0333 0.2749 
0.2073 0.7421 



as some investigations indicate that mean peak heights are marker dependent; 
FGA often has a lower peak height than other markers. Evidence calcula- 
tions and deconvolutions may be less sensitive to this fact, but it could be of 
crucial importance for the identification of silent alleles. 

4 Discussion 

On the previous pages we have described a simple and versatile model for 
description of mixed DNA profiles and demonstrated how it could be applied 
to case analysis. We should in particular emphasize that all our computations 
are exact (up to numerical precision) and direct consequences of the model we 
have used. Clearly, the model itself represents an approximation to reality, 
as any model will, but we believe it is a virtue that any further analysis 
- whether this be computation of likelihood ratios, deconvolution, or trace 
interpretation — does not need additional approximations or heuristics. Also, 
our method does not rely on a subjective pre-processing of the DNA profile 
with a manual or automatic allele calling that may introduce an additional 
source of error. 

In addition to the case discussed in this article, we have tried our method- 
ology on a number of challenging cases, including diffficult LTD mixtures, 
with very sensible and robust result and believe that the methodology is now 
sufficiently developed for use in proper case work. However, for any method- 
ology there is room for improvement and below we shall briefly discuss some 
issues worthy of further attention. 

4.1 Number of contributors 

One issue is associated with determining the number of unknown contributors 
to the trace. The traces considered need at least three unknown contribu- 
tors to be well explained. We have in the previous analyses introduced an 
additional unknown contributor to allow for spontaneous dropin, but showed 
that such an additional contributor had most likely only contributed tiny 
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amounts of DNA to the trace, if anything at all. The existing literature 
has had limited focus on this issue, typically using a heuristic argument and 
not added additional contributors unless they appeared to be necessary for 
explaining the DNA profile in question. 

However, as we operate with maximized likelihoods under hypotheses 
that there is at most a fixed number k of contributors, it is obvious that 
the maximized likelihood will increase — or at least not decrease — with 
the number of unknown contributors, both for the defence and prosecution. 
This follows from the fact that the hypothesis of at most k contributors is 
a sub-hypothesis of that with k + 1 contributors, so we are maximizing over 
a larger space when increasing k. The phenomenon is illustrated in Fig. |9j 
which shows the maximized likelihoods and corresponding ratios for up to 
eight contributors, i.e. six unknown contributors for the defence hypothesis. 
The likelihoods and ratios are normalized relative to what was obtained for 




~i 1 1 1 1 r 

3 4 5 6 7 8 

Number of contributors 



Figure 9: The maximized log- likelihood functions based on combining traces 
MC15 and MC18 for varying number of contributors and hypotheses, relative 
to their value for three contributors and displayed on a log 10 scale. 

three contributors. The likelihood does indeed increase and it increases more 
for the defence than for the prosecution, so the likelihood ratio decreases 
with the number of contributors. The change is tiny: for eight contributors 
the likelihood ratio has decreased from log 10 (Li?) = 14.09 for three contrib- 
utors to log 10 (Li?) = 14.04, hardly of any significance. Even if the defence 
is allowed eight contributors and the prosecution stays with three, we get 
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log 1Q (LR) = 13.69, which is still about what would be obtained for a perfect 
match on nine markers and in most cases sufficient for identification of K 3 
beyond reasonable doubt. 

Inspection of the maximum likelihood estimates reveals that the five un- 
known contributors with smallest amount have between them supplied less 
than 3.7% of the total amount of DNA shared in equal proportions, i.e. each 
about 0.75% of the total amount of DNA. Bearing in mind that the total 
solution of DNA contains about 100 human cells, this means that effectively 
all of their alleles (which are not shared with the three major contributors or 
in stutter position to those) have not been observed. The unknown contrib- 
utors just serve the purpose of explaining minor peak imbalances and seem 
irrelevant for the interpretation of the trace. 

We believe that the phenomenon of ever increasing likelihoods is an arte- 
fact of using direct maximum likelihood without a penalty for complexity for 
estimating the unknown parameters. There are various ways of overcoming 
this problem: for example, one could use a convention that unknown con- 
tributors are only added if their presence is statistically significant at some 
fixed level; one could maximize a likelihood that is penalized for having many 
contributors; or one could postulate a uniform prior distribution for the un- 
known fractions <ft an d use an integrated rather than maximized likelihood. 
All of these would be acceptable but will be essentially ad hoc so would need 
to be established by an agreed convention. We point out that the methodol- 



ogy in Lauritzen and Mortera (2002) would not be applicable as this is based 
on absence of dropout; the problem here is that one could have an infinite 
number of contributors with so little DNA that they have only been observed 
through minor perturbations of the peak heights. 

4.2 Extensions and modifications 

The analyses presented in this paper have been made with the simplest pos- 
sible variant of the model with sensible and rather robust results. However, 
there are some obvious potential improvements to be made. 

Marker dependence of parameters We have assumed that the PCR 
process parameters (/i, a, £) are identical for all markers. This clearly contra- 



dicts other findings in the literature (Butler, 2005 Puch-Solis et al. , 2012) 



which indicate that both mean peak heights and stutter percentages depend 
on marker, fragment length, and may also be different for different lanes in 
the EPG. Variation in these parameters can easily be incorporated in the 
model; as this will increase the number of parameters, it would be best to do 
so by using prior information from laboratory experiments, for example by 
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scaling peak heights as indicated in Puch-Solis et al. (2012 ) or by adding prior 
information by penalizing the likelihood function to ensure that parameter 
estimates are near those expected. 



Degradation For strongly degraded DNA samples it is necessary to adjust 



the model by correcting for fragment length. From the analysis in Tvede- 



brink et al. (2012b) it seems that the simplest and most natural way to do 
so would be to assume the parameter p in the gamma distribution — be- 
ing proportional to the amount of DNA — to depend exponentially on the 
fragment length A as 

logp = 5X + (. 

This would just add one additional parameter 5 representing the level of 
degradation of DNA to the model and would not in itself create specific 
difficulties for the methodology, although the complexity of the maximum 
likelihood estimation clearly increases whenever new parameters are added. 



Dropout We have chosen a simple approach to modelling dropout by tak- 
ing the consequence of our gamma model and using the probabilities for being 



below thresholds as dropout probabilities. The dropout model of Tvedebrink 



et al. (2012a) can be seen as using a logistic distribution rather than a gamma 



distribution for this purpose. The choice of the logistic model seems primarily 
motivated by the ability to analyse dropout data using standard software and 
may not in itself have a firm theoretical foundation. In any case it would be 
interesting to see whether any of the gamma and logistic models would best 
describe the experimental data behind Tvedebrink et al. (2012a). We note 
that since Tvedebrink et al. (2012a) uses average peak height as a proxy for 
the amount of DNA and the relation between peak height and amount may 
be different in LTD analysis, his parameter estimates may not be directly 
suitable for this purpose, whereas our estimation process would naturally fit 
the parameters to any specific trace in question. 



4.3 Other issues 

Model criticism An important point that we have not touched upon in 
the previous analyses is the need to validate the model used for any given 
dataset. We find that it is not enough to compare the likelihoods for two 
competing hypothesis if neither of them can be demonstrated to give a plau- 
sible explanation of the data at hand. Promising graphical methods for 
systematic validation (or the opposite) of a given model based on its inter- 
nal predictions and residual analysis are important and under development 
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(Graversen, 2013b). These methods exploit that the computational struc- 



ture of a Bayesian network naturally enables fast prediction of peak heights 



10 



based on partial information from the profiles. An example is given in Fig 
which shows the conditional probability transform of peak heights for MC15 
and MC18 under the prosecution hypothesis against quantiles of the uniform 
distribution, indicating an excellent fit of the gamma model. 
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Figure 10: Conditional probability transform of observed peak heights for 
MC15 and MC18 vs. uniform quantiles. 



Uncertain allele frequencies We have in the previous analysis treated 
the allele frequencies as fixed and known, although they in principle also are 
parameters which should be estimated. Using allele counts from databases 
directly as frequencies corresponds to estimating allele frequencies by maxi- 
mum likelihood and using the estimated frequencies as if they were known. 
This conforms with the way other parameters are treated in the present pa- 
per, but ignores the inherent uncertainty associated with the frequencies. It 



would be valuable to develop a scheme similar to that in Green and Mortera 



(2009) for incorporating both this uncertainty and kinship corrections into 



the methodology and we hope to address this in future work. 

Estimation uncertainty Although we have indicated the uncertainty of 
our parameter estimates by giving approximate standard errors based on the 
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Hessian of the log-likelihood function, we have neither made a detailed anal- 
ysis that justifies this, nor have we incorporated this additional uncertainty 
into our analysis. However it is eminently feasible to do so along the lines 
described in Graversen and Lauritzen (2011) and future work on this will be 
reported in Graversen (2013b). 
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